Skip to content

Support multi-layer algorithms that receive a data product from an unfold#571

Open
knoepfel wants to merge 4 commits into
Framework-R-D:mainfrom
knoepfel:multi-layer-failure
Open

Support multi-layer algorithms that receive a data product from an unfold#571
knoepfel wants to merge 4 commits into
Framework-R-D:mainfrom
knoepfel:multi-layer-failure

Conversation

@knoepfel
Copy link
Copy Markdown
Member

@knoepfel knoepfel commented May 6, 2026

Summary

This PR fixes a bug where transforms operating on a hierarchy layer downstream of an unfold could not receive data products produced by that unfold. The root cause was that the index_router had no mechanism to learn — from the unfold itself — how many children it had generated, so it could not emit the correct flush token at the right time.

Solution

The fix introduces two new components and refactors the flush/routing pipeline.

New: data_cell_tracker (phlex/model/)

Tracks the sequence of incoming data_cell_index values from the driver and determines which flush tokens are ready to be emitted when a new index arrives or the job ends. This logic was previously implicit in the source/index-router interaction.

New: child_tracker (phlex/model/)

Per-index tracker that tracks how many children have been processed across each child layer. Once the expected count (received from the unfold's flush result) is satisfied, child_tracker fires a callback to emit the flush token for that index.

declared_unfold

Now uses a third TBB output port (index_flush) carrying the child counts alongside the existing message and index-message ports.

index_router

Gains a flush_receiver input and an establish_layers() initializer that records which layers are produced by unfolds vs. consumed as inputs. route() now accepts the closeout flushes from data_cell_tracker rather than computing them internally, and drain() likewise receives the remaining flushes at job end.

framework_graph

  • Introduces a index_receiver_ node that decouples closeout flush emission from the source node.
  • Instantiates a data_cell_tracker (cell_tracker_) and feeds its output into index_router_.route().
  • Calls index_router_.establish_layers() before finalize(), using layer metadata gathered from the declared unfolds.
  • Connects each unfold's flush_sender() to index_router_.flush_receiver().

edge_maker

Refactored to return (provider_input_ports, multilayer_join_index_ports) instead of directly calling index_router_.finalize(), allowing framework_graph to call establish_layers() first.

Tests

  • test/data_cell_tracker_test.cpp — unit tests for data_cell_tracker in isolation.
  • test/child_tracker_test.cpp — unit-level tests for child_tracker verifying that committed child counts accumulate correctly through nested unfolds without involving the full framework_graph machinery.
  • test/fold_duplicate_layer_name_test.cpp — re-enables the previously disabled different_hierarchies test, now covering the case where two layers share the same name in a fold scenario.
  • test/unfold.cpp — additional unfold scenarios exercising multi-layer product consumption.

Files changed

Area Files
New model phlex/model/data_cell_tracker.hpp, phlex/model/data_cell_tracker.cpp, phlex/model/child_tracker.hpp, phlex/model/child_tracker.cpp
Modified core phlex/core/index_router.hpp, phlex/core/index_router.cpp, phlex/core/framework_graph.hpp, phlex/core/framework_graph.cpp, phlex/core/declared_unfold.hpp, phlex/core/edge_maker.hpp, phlex/model/fixed_hierarchy.hpp, phlex/model/fixed_hierarchy.cpp
New tests test/data_cell_tracker_test.cpp, test/child_tracker_test.cpp, test/fold_duplicate_layer_name_test.cpp
Modified tests test/unfold.cpp

Summary largely courtesy of Claude Sonnet 4.6

Resolves #359

@knoepfel knoepfel force-pushed the multi-layer-failure branch from 381d7c1 to fa25cb1 Compare May 6, 2026 21:48
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 94.77124% with 16 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
phlex/core/index_router.cpp 92.71% 6 Missing and 5 partials ⚠️
phlex/model/data_cell_tracker.cpp 91.11% 0 Missing and 4 partials ⚠️
phlex/core/framework_graph.cpp 97.87% 0 Missing and 1 partial ⚠️
@@            Coverage Diff             @@
##             main     #571      +/-   ##
==========================================
+ Coverage   82.11%   82.61%   +0.50%     
==========================================
  Files         157      161       +4     
  Lines        5720     5891     +171     
  Branches      646      686      +40     
==========================================
+ Hits         4697     4867     +170     
+ Misses        808      807       -1     
- Partials      215      217       +2     
Flag Coverage Δ
scripts 76.12% <ø> (ø)
unittests 82.61% <94.77%> (+0.50%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
phlex/core/declared_unfold.cpp 100.00% <100.00%> (+16.66%) ⬆️
phlex/core/declared_unfold.hpp 96.36% <100.00%> (-0.07%) ⬇️
phlex/core/edge_maker.hpp 100.00% <100.00%> (ø)
phlex/core/framework_graph.hpp 100.00% <ø> (ø)
phlex/core/index_router.hpp 100.00% <100.00%> (ø)
phlex/model/child_tracker.cpp 100.00% <100.00%> (ø)
phlex/model/child_tracker.hpp 100.00% <100.00%> (ø)
phlex/model/data_cell_tracker.hpp 100.00% <100.00%> (ø)
phlex/model/fixed_hierarchy.cpp 100.00% <100.00%> (ø)
phlex/model/fixed_hierarchy.hpp 100.00% <100.00%> (ø)
... and 3 more

... and 4 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 10694ca...143de65. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@knoepfel knoepfel force-pushed the multi-layer-failure branch 12 times, most recently from eb2cf70 to f4599e5 Compare May 12, 2026 20:49
@Framework-R-D Framework-R-D deleted a comment from greenc-FNAL May 12, 2026
@knoepfel knoepfel force-pushed the multi-layer-failure branch from f4599e5 to 8cac667 Compare May 14, 2026 14:28
@knoepfel knoepfel force-pushed the multi-layer-failure branch 6 times, most recently from 3bcd7a6 to 1e0b7a6 Compare May 14, 2026 20:07
@knoepfel knoepfel force-pushed the multi-layer-failure branch from 1e0b7a6 to 143de65 Compare May 14, 2026 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Change fold-result caching to use multilayer_join_node

1 participant